Erik Boertjes, TNO Information and Communication Technology, Delft, The Netherlands
before July 26th: erikboertjes@hotmail.com, after July 26th: erik.boertjes@tno.nl [PRIMARY contact]
Student team: NO
The tool was developed and implemented in June 2008 by TNO specifically
for this challenge. We believe, however, that its use is not limited to this
challenge. It allows the user to view and interact with the data from three
different viewpoints:
1) Sequence diagram: plots the phone calls in a manner analogous to a UML
sequence diagram. Each cell phone is represented by a vertical line. Each call
from one cell phone to another is represented by a horizontal line between the
corresponding vertical lines. Time runs from top to bottom, which makes the
order of the calls in time visible. A clustering algorithm (Cluto) places
vertical lines with many calls between them close to each other, which improves
the readability of the diagram.
2) Graph: each node is a cell phone, and each edge represents the aggregation
of calls between two phones. Aggregation can be performed per day or over the
complete 10-day period.
3) Color bar fingerprints: each cell phone is represented by a colored bar
that visualizes the locations of the cell towers from which calls were made
with that phone.
Screenshot 1: Sequence diagram view
Screenshot 2: Graph view
Screenshot 3: Color bar view
Two Page Summary: NO (but I would love to make one in case I am awarded a
certificate :-) )
ANSWERS:
Phone-1: What is the Catalano/Vidro social
network, as reflected in the cell phone call data, at the end of the time
period?
Phone-2: Characterize the changes in
the Catalano/Vidro social structure over the ten day period.
Detailed Answer:
The first given clue is that Fernando is ID 200. We start by locating
this ID in the graph view, which shows the number of calls between each pair of
phones (nodes), summed over the 10-day period. The thickness of an edge
represents the number of calls. A force-directed layout algorithm causes nodes
that call each other frequently to cluster. The tool allows searching for a
specific ID. Figure 1 shows that Fernando exchanged calls with IDs 1, 2, 3, 5,
97 and 137. Hovering over the links shows the number of calls to and from each
node. Fernando has called ID 5 most frequently, so we assume that ID 5 is his
brother Estaban.
Figure 1 phone connections with ID 200
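The aggregation behind the graph view can be illustrated with a minimal sketch. It is not the tool's actual code: the record layout (caller ID, callee ID) and the sample calls are assumptions made up for illustration, chosen only so that the result matches the contacts named above.

```python
from collections import Counter

# Hypothetical call records as (caller_id, callee_id) pairs. The real
# dataset layout is not described here, so this shape is an assumption.
calls = [
    (200, 5), (5, 200), (200, 5), (200, 1), (2, 200),
    (3, 200), (200, 97), (137, 200), (1, 5), (1, 2),
]

def edge_weights(calls):
    """Count calls per unordered pair of phones (the edge thickness)."""
    weights = Counter()
    for caller, callee in calls:
        weights[frozenset((caller, callee))] += 1
    return weights

def contacts_of(phone, weights):
    """Return {other_phone: number_of_calls} for each edge touching `phone`."""
    result = {}
    for pair, count in weights.items():
        # len(pair) == 2 skips a phone that (oddly) called itself.
        if phone in pair and len(pair) == 2:
            (other,) = pair - {phone}
            result[other] = count
    return result

weights = edge_weights(calls)
fernando = contacts_of(200, weights)
most_frequent = max(fernando, key=fernando.get)
print(sorted(fernando), most_frequent)
```

Using an unordered pair as the key merges calls in both directions into one edge, which corresponds to the aggregated edge in the graph view. A per-day aggregation would only need the day added to the key.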
By ticking the +1 box, the tool also displays connections that are one step
further away. Figure 2 shows that, from Fernando, almost half of all the
phones on the island can be reached through the 6 people that he calls.
Nodes 1, 3 and 5 in particular account for this: they have a large number of
distinct connections compared to other nodes.
Figure 2 Fernando's social contacts reach many people
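The "+1" expansion amounts to collecting all phones within two steps of the selected phone. The sketch below shows one way to compute this and the share of phones reached. The function names and the small call list are hypothetical, and the numbers do not come from the challenge data.

```python
from collections import defaultdict

# Hypothetical (caller_id, callee_id) records, for illustration only.
calls = [(200, 1), (200, 5), (1, 10), (1, 11), (5, 12), (20, 21)]

def build_adjacency(calls):
    """Map each phone to the set of phones it exchanged calls with."""
    adjacency = defaultdict(set)
    for caller, callee in calls:
        adjacency[caller].add(callee)
        adjacency[callee].add(caller)
    return adjacency

def within_hops(adjacency, start, hops):
    """Phones reachable from `start` in at most `hops` steps, excluding `start`."""
    seen = {start}
    frontier = {start}
    for _ in range(hops):
        # Expand one step and keep only phones not visited before.
        frontier = {n for p in frontier for n in adjacency.get(p, ())} - seen
        seen |= frontier
    return seen - {start}

adjacency = build_adjacency(calls)
direct = within_hops(adjacency, 200, 1)      # direct contacts only
extended = within_hops(adjacency, 200, 2)    # with the "+1" step added
share = len(extended) / len(adjacency)       # fraction of all phones seen
print(sorted(direct), sorted(extended), share)
```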
With these 7 IDs selected, we switch to the sequence diagram view. This view
shows the 7 cell phones as vertical lines, and the calls between them as
horizontal arrows (Figure 3). Time is displayed from top to bottom.
Figure 3 sequence diagram of 7 phones and their calls
Hovering over a vertical line highlights the calls from and to the
corresponding cell phone and shows their number in a window. Among these 7,
ID 5 receives the most calls. Could he be David Vidro, the one coordinating
Paraiso activities? (We think not, see below.) Zooming in (Figure 4) reveals an
interesting pattern: ID 1 calls IDs 2, 3 and 5 in calls that overlap in time.
Besides some calls with negative duration (which we removed from the database),
we found many such overlapping calls in the dataset. We even found 2 persons
(IDs 138 and 321) having two calls to each other at the same time (on 2006-06-08 at
Figure 4 sequence diagram zoomed in
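The two data checks mentioned above, dropping calls with negative duration and finding calls that overlap in time for the same phone, could be sketched as follows. The record layout (caller, callee, start time, duration in seconds) and all timestamps are assumptions for illustration and are not taken from the dataset.

```python
from collections import defaultdict
from datetime import datetime, timedelta
from itertools import combinations

# Hypothetical records: (caller_id, callee_id, start, duration_seconds).
calls = [
    (1, 2, datetime(2006, 6, 3, 10, 0, 0), 600),
    (1, 3, datetime(2006, 6, 3, 10, 5, 0), 300),
    (1, 5, datetime(2006, 6, 3, 11, 0, 0), 120),
    (2, 3, datetime(2006, 6, 3, 12, 0, 0), -30),   # invalid duration
]

def drop_negative_durations(calls):
    """Keep only calls whose duration is zero or positive."""
    return [call for call in calls if call[3] >= 0]

def overlapping_calls(calls):
    """Return {phone: [(call_a, call_b), ...]} for calls overlapping in time."""
    intervals = defaultdict(list)
    for caller, callee, start, duration in calls:
        end = start + timedelta(seconds=duration)
        for phone in {caller, callee}:
            intervals[phone].append((start, end, caller, callee))
    overlaps = {}
    for phone, spans in intervals.items():
        spans.sort()  # by start time, so a starts no later than b below
        found = [(a, b) for a, b in combinations(spans, 2) if b[0] < a[1]]
        if found:
            overlaps[phone] = found
    return overlaps

clean = drop_negative_durations(calls)
overlaps = overlapping_calls(clean)
print(sorted(overlaps))
```

The pairwise comparison is quadratic in the number of calls per phone, which is acceptable for a sketch. A sweep over sorted start times would scale better.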
Next, we had a look at the location information given in the dataset. We
used color to indicate location on the island, varying from pink in the
Figure 5 color bar view; the red arrow, not in the original screenshot, marks
the bar for ID 97
From this view we learn that:
- Most people make calls from about 1 or 2 different areas.
- Few people, like ID 309, make calls from 5 or more different areas
(very colorful bar).
- Many people, like ID 11, have bars that have a substantial pink and a
substantial light green part, which may indicate that they travel back and
forth from one side of the island to the other (e.g. commuting).
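The observations above come down to counting, per phone, how often each tower is used and how many distinct towers appear. A minimal sketch follows, assuming records of the form (phone ID, tower ID) and treating each tower as one area, which is a simplification. The tower numbers for IDs 1 and 97 follow the text below; phone 900 and all other tower values are made up.

```python
from collections import Counter, defaultdict

# Hypothetical (phone_id, tower_id) records, one per outgoing call.
calls = [
    (1, 29), (1, 29), (1, 11),
    (97, 22), (97, 22), (97, 22),
    (900, 3), (900, 8), (900, 14), (900, 21), (900, 27),
]

def tower_fingerprints(calls):
    """Map each phone to a Counter of the towers its calls were made from."""
    fingerprints = defaultdict(Counter)
    for phone, tower in calls:
        fingerprints[phone][tower] += 1
    return fingerprints

def distinct_areas(fingerprints):
    """Number of different towers used by each phone."""
    return {phone: len(towers) for phone, towers in fingerprints.items()}

fingerprints = tower_fingerprints(calls)
areas = distinct_areas(fingerprints)
single_location = sorted(p for p, n in areas.items() if n == 1)
many_locations = sorted(p for p, n in areas.items() if n >= 5)
print(areas, single_location, many_locations)
```

Each Counter holds the data for one color bar: the share of a tower in the counter would set the width of that tower's color segment.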
When focusing on our Catalano/Vidro friends we see that:
ID 1 is making calls from cell towers 29 (from a boat) and 11 (city)
ID 2 is making calls in the same area as ID 1, mainly from a boat
ID 3 is making calls from 10 (hills? a lookout point?) and 30 (a boat)
ID 5 seems to be calling mainly from the water, both at the
ID 97 makes all calls from one location, the city most south of the
island (cell tower 22)
ID 137 is traveling between
From this we conclude that ID 97 is not joining the activities in the
field but is rather coordinating high-level activities from a location far from
the action. We think that our former suspect, ID 5, is coordinating activities
on a merely local level and that ID 97 is David Vidro. Further indications for
this are given by a sequence diagram view of ID 97 and the IDs he exchanges
calls with (see Figure 6). There is a lot of activity on day 6 between
Figure 6 calls from and to David (ID 97); labels are not part of the original
screenshot
Then we had a look at the changing social structure of the
Catalano/Vidro families. The graph of 2006-06-01 (= day 0) (Figure 7) shows
that at the start of the 10-day period IDs 1, 2, 3, 5 and 200 form a group.
Figure 7 social structure at start of the 10 day period
Figure 8 sequence diagram: changes in social structure during the
last 6 days
Figure 8 shows the sequence diagram of the calls between the 7 people from
2006-06-05 to 2006-06-10. On 2006-06-08 (day 7) there is no contact between
any of the 7 people (white area in the sequence diagram). Calling behaviour
before and after that date differs significantly. Before 2006-06-08, the
sequence diagram shows that ID 1 and ID 200 have contact with each other and
with IDs 2, 3 and 5, while IDs 2, 3 and 5 do not have contact with each other.
After 2006-06-08 there is no contact between any members of the group of IDs
1, 2, 3, 5 and 200. ID 200 (Fernando) now has contact with ID 97 and ID 137.
Maybe ID 200 was promoted to the higher echelons of the Paraiso movement, and
the old gang fell apart.
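The before/after comparison can be expressed as a set difference between the contact pairs of two date ranges. The sketch below uses made-up call records that only mimic the pattern described above. The record layout (caller, callee, day) is an assumption.

```python
from datetime import date

# Hypothetical (caller_id, callee_id, day) records mimicking the pattern.
calls = [
    (1, 200, date(2006, 6, 5)), (1, 2, date(2006, 6, 5)),
    (1, 3, date(2006, 6, 6)), (1, 5, date(2006, 6, 6)),
    (200, 2, date(2006, 6, 7)), (200, 3, date(2006, 6, 7)),
    (200, 5, date(2006, 6, 7)),
    (200, 97, date(2006, 6, 9)), (137, 200, date(2006, 6, 10)),
]

def pairs_between(calls, first_day, last_day):
    """Unordered pairs of phones with at least one call in the date range."""
    return {
        frozenset((caller, callee))
        for caller, callee, day in calls
        if first_day <= day <= last_day and caller != callee
    }

before = pairs_between(calls, date(2006, 6, 5), date(2006, 6, 7))
after = pairs_between(calls, date(2006, 6, 9), date(2006, 6, 10))
vanished = before - after   # contacts that stop after the silent day
appeared = after - before   # contacts that only exist afterwards
print(len(vanished), sorted(sorted(pair) for pair in appeared))
```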
Figure 9 Anomaly ID 309
We conclude with an anomaly that we found with our tool. Browsing
through the graph, we found (among others) ID 309 interesting because of its
large number of distinct connections. Figure 9 shows the corresponding sequence
diagram, in which only ID 309 and the phones connected to it are shown. The
pattern shows that ID 309 mostly receives calls rather than making them.
Almost all calls are made during the last 3 days of the 10-day period,
from